semi-unsupervised learning
Scientists detect alien signals coming from 5 nearby stars
Are we alone in the universe? Scientists may have just moved us closer to answering this question. The team – led by researchers from the University of Toronto – has streamlined the search for extraterrestrial life by using a new algorithm to organize the data from their telescopes into categories, in order to distinguish between real signals and interference. This has allowed them to quickly sort through the information and find patterns, through an artificial intelligence process known as machine learning. They discovered eight extraterrestrial signals that seem to have the hallmarks of technology.
Semi-unsupervised Learning for Time Series Classification
Davidson, Padraig, Steininger, Michael, Huhn, André, Krause, Anna, Hotho, Andreas
Time series are ubiquitous and therefore inherently hard to analyze and ultimately to label or cluster. With the rise of the Internet of Things (IoT) and its smart devices, data is collected in large amounts every second. The collected data is rich in information, as one can detect accidents (e.g. cars) in real time, or assess injury/sickness over a given time span (e.g. health devices). Due to their chaotic nature and the massive number of data points, time series are hard to label manually. Furthermore, new classes within the data could emerge over time (contrary to e.g. handwritten digits), which would require relabeling the data. In this paper we present SuSL4TS, a deep generative Gaussian mixture model for semi-unsupervised learning, to classify time series data. With our approach we can alleviate manual labeling steps, since we can detect sparsely labeled classes (semi-supervised) and identify emerging classes hidden in the data (unsupervised). We demonstrate the efficacy of our approach with established time series classification datasets from different domains.
Semi-Unsupervised Learning with Deep Generative Models: Clustering and Classifying using Ultra-Sparse Labels
Willetts, Matthew, Roberts, Stephen J, Holmes, Christopher C
We introduce $\textit{semi-unsupervised learning}$, an extreme case of semi-supervised learning with ultra-sparse categorisation where some classes have no labels in the training set. That is, in the training data some classes are sparsely labelled and other classes appear only as unlabelled data. Many real-world datasets are conceivably of this type. We demonstrate that effective learning in this regime is only possible when a model is capable of capturing both semi-supervised and unsupervised learning. We develop two deep generative models for classification in this regime that extend previous deep generative models designed for semi-supervised learning. By changing their probabilistic structure to contain a mixture of Gaussians in their continuous latent space, these new models can learn in both unsupervised and semi-unsupervised paradigms. We demonstrate their performance both for semi-unsupervised and unsupervised learning on various standard datasets. We show that our models can learn in a semi-unsupervised manner on Fashion-MNIST. Here we artificially mask out all labels for half of the classes of data and keep $2\%$ of labels for the remaining classes. Our model is able to learn effectively, obtaining a trained classifier with $(77.2\pm1.3)\%$ test set accuracy. We can also train on Fashion-MNIST unsupervised, obtaining $(75.2\pm1.5)\%$ test set accuracy. Additionally, doing the same for MNIST unsupervised we get $(96.3\pm0.9)\%$ test set accuracy, which is state-of-the-art for fully probabilistic deep generative models.
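The label-masking setup described in this abstract (all labels hidden for half of the classes, 2% of labels kept for the rest) can be illustrated with a minimal sketch. It is not the authors' code: the function name, the `UNLABELLED` sentinel, the choice of which classes to hide, the "keep at least one label per class" rule, and the toy label list are all assumptions made for illustration.

```python
import random

UNLABELLED = -1  # sentinel for "no label"; an assumption of this sketch


def make_semi_unsupervised_labels(labels, hidden_classes, keep_fraction=0.02, seed=0):
    """Return a masked copy of `labels`.

    Classes in `hidden_classes` lose every label. Every other class keeps
    roughly `keep_fraction` of its labels (at least one, by assumption).
    """
    rng = random.Random(seed)
    masked = [UNLABELLED] * len(labels)

    # Group example indices by their true class.
    by_class = {}
    for idx, y in enumerate(labels):
        by_class.setdefault(y, []).append(idx)

    for y, indices in by_class.items():
        if y in hidden_classes:
            continue  # this class appears only as unlabelled data
        n_keep = max(1, round(keep_fraction * len(indices)))
        for idx in rng.sample(indices, n_keep):
            masked[idx] = y
    return masked


# Toy stand-in for a 10-class dataset with 500 examples per class.
labels = [c for c in range(10) for _ in range(500)]
hidden = {5, 6, 7, 8, 9}  # which half to hide is arbitrary here
masked = make_semi_unsupervised_labels(labels, hidden_classes=hidden)

n_labelled = sum(1 for y in masked if y != UNLABELLED)
print(n_labelled, "of", len(labels), "examples keep their label")
```

A model trained on `masked` sees a few labelled examples for classes 0 to 4 and must discover classes 5 to 9 purely from unlabelled data, which is the regime the abstract calls semi-unsupervised.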
Semi-unsupervised Learning of Human Activity using Deep Generative Models
Willetts, Matthew, Doherty, Aiden, Roberts, Stephen, Holmes, Chris
Here we demonstrate a new deep generative model for classification. We introduce 'semi-unsupervised learning', a problem regime related to transfer learning and zero/few-shot learning where, in the training data, some classes are sparsely labelled and others entirely unlabelled. Models able to learn from training data of this type are potentially of great use, as many medical datasets are 'semi-unsupervised'. Our model demonstrates superior semi-unsupervised classification performance on MNIST to model M2 from Kingma and Welling (2014). We apply the model to human accelerometer data, performing activity classification and structure discovery on windows of time series data.
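As a rough illustration of what "windows of time series data" can mean in practice, the sketch below cuts a stream of accelerometer samples into fixed-length windows. The abstract does not state the window length, the overlap, or the sampling format, so the function name, the parameters `window_size` and `step`, and the toy signal are all hypothetical.

```python
def sliding_windows(signal, window_size, step):
    """Split a sequence of samples into fixed-length windows.

    Windows overlap when `step` is smaller than `window_size`.
    A trailing remainder shorter than `window_size` is dropped.
    """
    if window_size <= 0 or step <= 0:
        raise ValueError("window_size and step must be positive")
    last_start = len(signal) - window_size
    return [signal[start:start + window_size]
            for start in range(0, last_start + 1, step)]


# Toy tri-axial signal: 10 samples of (x, y, z) acceleration.
signal = [(0.1 * t, 0.0, 1.0) for t in range(10)]
windows = sliding_windows(signal, window_size=4, step=2)

print(len(windows), "windows of", len(windows[0]), "samples each")
```

Each window would then be treated as one example for the classifier, labelled or unlabelled.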